Harvester / Add support for Zenodo and Pangaea #8574

Draft: wants to merge 3 commits into base: main

fxprunayre
Member

Zenodo (https://zenodo.org/) is an "Open repository for EU-funded research outputs from Horizon Europe, Euratom and earlier Framework Programmes."

PANGAEA (https://www.pangaea.de/) is a data publisher. "Our services are open for archiving, publishing, and distributing georeferenced data from earth system research."

Add import from both providers by adding a conversion to ISO19115-3 for each. The Zenodo JSON export and the Pangaea pan_md output are used as input.

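Sample input:

  • from Zenodo: https://zenodo.org/records/6343858, https://zenodo.org/records/6343858/export/json
  • from Pangaea: https://doi.pangaea.de/10.1594/PANGAEA.820426, https://ws.pangaea.de/oai/provider?verb=GetRecord&identifier=oai:pangaea.de:doi:10.1594/PANGAEA.820342&metadataPrefix=pan_md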

Conversions can be used to harvest records using the simple URL harvester.
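
The two source URLs used in this PR follow a regular pattern, so a harvester URL can be derived from a DOI or a record id. The snippet below is only a sketch for illustration: the helper names `pangaea_record_url` and `zenodo_export_url` are hypothetical and not part of GeoNetwork, and the URL patterns are inferred from the sample URLs in this PR rather than from provider documentation.

```python
from urllib.parse import urlencode

# Base endpoint taken from the sample Pangaea URL in this PR.
PANGAEA_OAI = "https://ws.pangaea.de/oai/provider"


def pangaea_record_url(doi: str) -> str:
    """Build an OAI-PMH GetRecord URL returning pan_md for one DOI (hypothetical helper)."""
    params = {
        "verb": "GetRecord",
        "identifier": f"oai:pangaea.de:doi:{doi}",
        "metadataPrefix": "pan_md",
    }
    # Keep ':' and '/' unescaped so the result matches the URL form used in the PR.
    return f"{PANGAEA_OAI}?{urlencode(params, safe=':/')}"


def zenodo_export_url(record_id: int) -> str:
    """Build the JSON export URL for one Zenodo record (hypothetical helper)."""
    return f"https://zenodo.org/records/{record_id}/export/json"


print(pangaea_record_url("10.1594/PANGAEA.820342"))
print(zenodo_export_url(6343858))
```

Either URL can then be pasted into the `url` field of a simple URL harvester.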

Pangaea configuration

{"@id":"603","@type":"simpleurl","owner":["1"],"ownerGroup":["undefined"],"ownerUser":[[]],"site":{"name":"pangaea","uuid":"613eeff0-e8ee-45ea-ac6d-e067f661934f","account":{"use":false,"username":[],"password":[]},"url":"https://ws.pangaea.de/oai/provider?verb=GetRecord&identifier=oai:pangaea.de:doi:10.1594/PANGAEA.820342&metadataPrefix=pan_md","icon":"blank.png","loopElement":".//*[local-name() = 'MetaData']","numberOfRecordPath":[],"recordIdPath":"*/@id","pageSizeParam":[],"pageFromParam":[],"toISOConversion":"schema:iso19115-3.2018:convert/fromPangaea"},"content":{"validate":"NOVALIDATION","importxslt":"none","batchEdits":"[]","translateContent":false,"translateContentLangs":"","translateContentFields":[]},"options":{"every":"0 0 0 ? * *","oneRunOnly":false,"overrideUuid":"SKIP","status":"active"},"privileges":[{"@id":"1","operation":[{"@name":"view"},{"@name":"dynamic"},{"@name":"download"}]}],"ifRecordExistAppendPrivileges":false,"info":{"lastRun":"2024-12-18T09:29:40.276462Z","running":true}}
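
To make the `loopElement` and `recordIdPath` settings above easier to read, here is a rough standard-library approximation of what they select. It is not the harvester's code (GeoNetwork evaluates the XPath itself), and the sample XML is a made-up, heavily reduced stand-in for a pan_md response: the namespace URI, the `citation` element, and the `dataset820342` id value are all assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical, reduced stand-in for an OAI-PMH GetRecord response in pan_md.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <GetRecord><record><metadata>
    <md:MetaData xmlns:md="http://example.org/hypothetical-pan-md">
      <md:citation id="dataset820342"><md:title>Hypothetical title</md:title></md:citation>
    </md:MetaData>
  </metadata></record></GetRecord>
</OAI-PMH>"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def loop_elements(root: ET.Element, name: str = "MetaData") -> list:
    """Approximate .//*[local-name() = 'MetaData']: descendants with that local name."""
    return [el for el in root.iter() if el is not root and local_name(el.tag) == name]


def record_id(record: ET.Element):
    """Approximate */@id: the first 'id' attribute found on a child element."""
    for child in record:
        if "id" in child.attrib:
            return child.attrib["id"]
    return None


root = ET.fromstring(SAMPLE)
for record in loop_elements(root):
    print(record_id(record))
```

Matching on the local name rather than a qualified name means the configuration does not need to declare the pan_md namespace.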

Zenodo configuration

{"@id":"446","@type":"simpleurl","owner":["1"],"ownerGroup":["2"],"ownerUser":["undefined"],"site":{"name":"zenodo","uuid":"13dfce70-234f-4a9e-9ec3-698b3d555ca6","account":{"use":false,"username":[],"password":[]},"url":"https://zenodo.org/records/6343858/export/json","icon":"blank.png","loopElement":".","numberOfRecordPath":[],"recordIdPath":"/id","pageSizeParam":[],"pageFromParam":[],"toISOConversion":"schema:iso19115-3.2018:convert/fromZenodo"},"content":{"validate":"NOVALIDATION","importxslt":"none","batchEdits":"[]","translateContent":false,"translateContentLangs":"","translateContentFields":[]},"options":{"every":"0 0 0 ? * *","oneRunOnly":false,"overrideUuid":"SKIP","status":"active"},"privileges":[{"@id":"1","operation":[{"@name":"view"},{"@name":"dynamic"},{"@name":"download"}]}],"ifRecordExistAppendPrivileges":false,"info":{"lastRun":"2024-12-18T08:59:26.505279Z","running":false,"result":{"added":"1","atomicDatasetRecords":"0","badFormat":"0","collectionDatasetRecords":"0","datasetUuidExist":"0","privilegesAppendedOnExistingRecord":"0","doesNotValidate":"0","xpathFilterExcluded":"0","duplicatedResource":"0","fragmentsMatched":"0","fragmentsReturned":"0","fragmentsUnknownSchema":"0","incompatible":"0","recordsBuilt":"0","recordsUpdated":"0","removed":"0","serviceRecords":"0","subtemplatesAdded":"0","subtemplatesRemoved":"0","subtemplatesUpdated":"0","total":"1","unchanged":"0","unknownSchema":"0","unretrievable":"0","updated":"0","thumbnails":"0","thumbnailsFailed":"0"}}}
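
For JSON sources, the record id path in this configuration looks like a JSON Pointer (`/id`). The sketch below shows how such a pointer resolves against a record, assuming that a `loopElement` of `.` means the whole document is the single record. That reading of `.` is an assumption, not something confirmed here. The sample document is hypothetical and much smaller than a real Zenodo export, and `resolve_pointer` is an illustrative helper, not a GeoNetwork function.

```python
import json


def resolve_pointer(doc, pointer: str):
    """Resolve a JSON Pointer (RFC 6901) against a parsed JSON document.

    '' is the RFC 6901 root; '.' is also treated as the root here, which is
    an assumption about how the harvester reads loopElement.
    """
    if pointer in ("", "."):
        return doc
    if not pointer.startswith("/"):
        raise ValueError(f"not a JSON Pointer: {pointer!r}")
    node = doc
    for token in pointer[1:].split("/"):
        # Unescape in the order required by RFC 6901: '~1' first, then '~0'.
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


# Hypothetical, reduced stand-in for https://zenodo.org/records/6343858/export/json
SAMPLE = json.loads('{"id": "6343858", "metadata": {"title": "Hypothetical title"}}')

record = resolve_pointer(SAMPLE, ".")   # loopElement
print(resolve_pointer(record, "/id"))   # recordIdPath
```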

Checklist

  • I have read the contribution guidelines
  • Pull request provided for main branch, backports managed with label
  • Good housekeeping of code, cleaning up comments, tests, and documentation
  • Clean commit history broken into understandable chunks, avoiding big commits with hundreds of files, cautious of reformatting and whitespace changes
  • Clean commit messages, longer verbose messages are encouraged
  • API Changes are identified in commit messages
  • Testing provided for features or enhancements using automatic tests
  • User documentation provided for new features or enhancements in manual
  • Build documentation provided for development instructions in README.md files
  • Library management using pom.xml dependency management. Update build documentation with intended library use and library tutorials or documentation

Funded by Ifremer

Zenodo is an "Open repository for EU-funded research outputs from Horizon Europe, Euratom and earlier Framework Programmes."
PANGAEA is a data publisher. "Our services are open for archiving, publishing, and distributing georeferenced data from earth system research."

Add import from both providers by adding a conversion to ISO19115-3 for each.

Sample input:
* from Zenodo https://zenodo.org/records/6343858, https://zenodo.org/records/6343858/export/json
* from Pangaea https://doi.pangaea.de/10.1594/PANGAEA.820426, https://ws.pangaea.de/oai/provider?verb=GetRecord&identifier=oai:pangaea.de:doi:10.1594/PANGAEA.820342&metadataPrefix=pan_md

Conversions can be used to import or harvest records using the simple URL harvester.

Funded by Ifremer
fxprunayre added this to the 4.4.7 milestone Dec 18, 2024
fxprunayre force-pushed the 44-zenodo-and-pangaea-harvester branch from 31a2f5b to c55e21c, December 18, 2024 13:30

sonarqubecloud bot commented Jan 2, 2025

Quality Gate failed

Failed conditions
0.0% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud
